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16  Abstract 

Optimal  Doppler  velocity  estimation,  under  the  constraint  of  small  sample  size,  is  explored  for  a  standard  Gaussian 
signal  measurement  model  and  thematic  maximum  likelihood  (ML)  and  Bayes  estimation.  Because  the  model  considered 
depends  on  a  vector  parameter  [velocity,  spectrum  width,  and  signal-to-noise  ratio  (SNR)],  the  exact  formulation  of  an 
ML  or  Bayes  solution  involves  a  system  of  equations  that  is  neither  uncoupled  nor  explirit  in  form.  Historically,  iterative 
methods  have  been  the  most  suggested  approach  to  solving  the  required  equations.  In  addition  to  being  computationally 
intensive,  it  is  unclear  whether  iterative  methods  can  be  constructed  to  perform  well  given  a  small-sample  size  and  low 
signal  strength. 

This  report  takes  a  different  approach  and  seeks  to  construct  approximate  (ML  and  Bayes)  estimators  based  on  the 
notion  of  using  constrained  adaptive  models  to  deal  with  nuisance  parameter  removal.  A  Monte  Carlo  simulation  is  used 
to  determine  small-sample  estimator  statistics  and  to  demonstrate  true  performance  bounds  in  the  case  of  known  nuisance 
values.  Performance  comparisons  between  these  optimal  forms  and  other  standard  estimators  [pulse  pair  (PP)  and  a 
frequency  domain  wind  profder  (WP)  method]  are  presented.  Performance  sensitivity  of  the  optimal  algorithms,  with 
respect  to  uncertainty  in  the  values  of  model  nuisance  parameters,  is  explored  and  provides  the  foundation  for  the 
recommendation  to  seek  an  adaptive  method. 

An  adaptive  estimation  method,  closely  allied  with  the  derived  ML  and  Bayes  formulas,  is  developed  using  information 
theoretic  methods  to  constrain  the  adaptation  process.  The  improved,  near  optimal  performance  of  the  new  Doppler 
velocity  estimator  is  documented  by  comparison  to  derived  optimal  bounds  and  to  the  performance  of  the  PP  method. 
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ABSTRACT 


Optimal  Doppler  velocity  estimation,  under  the  constraint  of  small  sample 
size,  is  explored  for  a  standard  Gaussian  signal  measurement  model  and  thematic 
maximum  likelihood  (ML)  and  Bayes  estimation.  Because  the  model  considered 
depends  on  a  vector  parameter  [velocity,  spectrum  width,  and  signal-to-noise  ratio 
(SNR)],  the  exact  formulation  of  an  ML  or  Bayes  solution  involves  a  system  of  equa¬ 
tions  that  is  neither  uncoupled  nor  explicit  in  form.  Historically,  iterative  methods 
have  been  the  most  suggested  approach  to  solving  the  required  equations.  In  addi¬ 
tion  to  being  computationally  intensive,  it  is  unclear  whether  iterative  methods  can 
be  constructed  to  perform  well  given  a  small-sample  size  and  low  signal  strength. 

This  report  takes  a  different  approach  and  seeks  to  construct  approximate  (ML 
and  Bayes)  estimators  based  on  the  notion  of  using  constrained  adaptive  models 
to  deal  with  nuisance  parameter  removal.  A  Monte  Carlo  simulation  is  used  to 
determine  small-sample  estimator  statistics  and  to  demonstrate  true  performance 
bounds  in  the  case  of  known  nuisance  values.  Performance  comparisons  between 
these  optimal  forms  and  other  standard  estimators  [pulse  pair  (PP)  and  a  frequency 
domain  wind  profiler  (WP)  method]  are  presented.  Performance  sensitivity  of  the 
optimal  algorithms,  with  respect  to  uncertainty  in  the  values  of  model  nuisance 
parameters,  is  explored  and  provides  the  foundation  for  the  recommendation  to 
Beek  an  adaptive  method. 

An  adaptive  estimation  method,  closely  allied  with  the  derived  ML  and  Bayes 
formulas,  is  developed  using  information  theoretic  methods  to  constrain  the  adapta¬ 
tion  process.  The  improved,  near  optimal  performance  of  the  new  Doppler  velocity 
estimator  is  documented  by  comparison  to  derived  optimal  bounds  and  to  the  per¬ 
formance  of  the  PP  method. 
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1.  INTRODUCTION 


Mean  velocity  (i.e.,  first  spectral  moment)  estimation  is  central  to  pulsed- Doppler  weather- 
radar  processing.  However,  some  radars,  such  as  the  Terminal  Doppler  Weather  Radar  (TDWR) 
and  Airport  Surveillance  Radar  (ASR),  operate  under  conditions  that  test  the  limits  of  current 
capabilities.  For  example,  high  scan  rate  and  fine  (azimuth)  sampling  requirements  yield  a  small 
number  of  pulse  samples  per  gate1  (i.e.,  small-sample  size)  while  dear-air  conditions  result  in  low 
signal-to-noise  levels.  Hence,  coupled  with  general  interest  for  optimal  velocity  estimation,  specific 
needs  motivate  an  investigation  of  improved  performance  under  the  constraints  of  small-sample  size 
and  low  signal  strength. 

As  is  often  the  case,  the  issue  of  practicability  limits  the  degree  to  which  optimal  perfor¬ 
mance  can  be  attained.  Further  complicating  matters  in  this  report  is  an  assumed  small-sample 
constraint  that  makes  the  definition  of  optimality  itself  problematic.  Indeed,  the  natural  appeal 
to  the  (asymptotic)  optimality  of  maximum  likelihood  (ML)  could  fail;  verification  is  clearly  re¬ 
quired.  The  Cramer-Rao  (CR)  performance  bound  could  be  too  optimistic,  likely  unattainable  by 
any  estimator  and  therefore  useless  in  gauging  what  room  there  is  for  improvement.  Because  few 
theoretical  results  exist  to  serve  as  guides,  numerical  methods  typically  must  be  relied  upon  for  the 
relevant  analysis. 

This  report  presents  a  study  of  the  velocity  estimation  problem  and  provides  arguments  for 
frequency-domain  smoothing  (autocorrelation-lag  weighting)  as  a  means  of  achieving  improved  ve¬ 
locity  estimation  performance.  Although  the  emphasis  is  on  small-sample  analysis,  the  results  are 
not  restricted  to  the  small-sample  case.  Note  that  the  Bayes  (conditional  mean)  estimator  (defined 
by  the  usual  formulation)  minimizes  the  mean-squared  estimation  error  over  all  estimators.  This 
property  holds  regardless  of  sample  size  and  hence  rationalizes  the  use  of  Bayes  performance  as  a 
benchmark  for  optimality  in  this  small-sample  case.  Unfortunately,  even  assuming  the  conventional 
Gaussian  signal  model,  the  conditional  mean  is  computationally  impractical.  It  nevertheless  pro¬ 
vides  useful  insight  to  the  development  of  improved  (and  practicable)  frequency-domain  velocity 
estimators. 

At  the  heart  of  this  study  is  a  Monte  Carlo  analysis  examining  small-sample  performance  for 
selected  mean-frequency  estimators.  In  addition  to  the  pulse  pair  (PP)  estimator  and  a  typical 
Fast  Fourier  Transform  (FFT)  estimator  [i.e.,  a  periodogram  based  estimator  derived  from  wind- 
profiler  (WP)  literature],  maximum  likelihood  (ML)  and  Bayes  estimators  (these  two  based  on  a 
Gaussian  signal  model)  are  examined  and  compared.  Sample  size  is  fixed  to  a  small  (M  =  20) 
value,  and  performance  at  low  signal-to-noise  ratios  (SNR)  is  also  a  point  of  focus.  The  results  of 


^or  TDWR,  on  the  order  of  40  pulse  samples  are  coherently  processed  per  output  velocity  estimate; 
for  ASR-9,  on  the  order  ol  18  (although  data  sharing  has  been  used  to  extend  this  value  to  27). 
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these  analyses  are  used  to  recommend  an  optimal  smoothing  (weighting)  strategy  for  periodogram 
(autocorrelation-lag)  based  estimation. 

A6  stated,  the  motivating  interest  is  improved  velocity  estimation  for  Doppler  weather  radars, 
and  the  content  of  this  report  is  clearly  oriented  toward  weather-radar  processing.  However,  the 
results  of  Section  6  should  be  of  a  more  general  interest,  presenting  a  new  and  novel  method  for 
dealing  with  the  fundamental  problem  of  nuisance  parameter  removal  under  the  constraint  of  a 
limited  sample  size. 
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2.  BACKGROUND 


2.1  Statistical  Model 

For  this  report,  it  is  assumed  that  a  frequently  adopted  Gaussian  model  (for  single  source 
weather  returns  in  additive  noise)  is  appropriate  (see,  for  example,  Doviak  and  Zrnid  [1]).  This 
model  is  characterized  by  the  sample  covariance  relation 

rm  =  e-i^mr/A  +  (!) 


where  the  model  parameters  V  and  av  represent  mean  Doppler  velocity  and  spectrum  width,  A 
is  the  radar  wavelength,  and  S  and  Af  respectively  represent  signal  and  noise  power  magnitudes. 
Complex-valued  radar  returns,  Zr  =  [zo  Zi  ...za/-i]»2  corresponding  to  a  single  range  cell  are 
assumed  to  represent  M  equally-spaced  samples  (separated  in  time  by  pulse  separation  r)  of  a 
complex  Gaussian  process  with  covariance  matrix  R  =  E[ZZ t].  In  view  of  Equation  (1),  R  can  be 
given  the  parametric  representation 


R  =  D  [SG  +  A'ljD*  =  SDGD*  +  ATI, 


where 


D  =  D(u)  =  diag[l  e~3“  . 

‘  P( o) 

/*(!) 

p{M  -  1) 

G  =  G(a)  = 

P(  1) 

p(0) 

p(M  -  2) 

.  P(M-I) 

p{M  -  2)  ... 

p{  0) 

I  =  diag[l  1 . . 

•1], 

p(m )  =  e 


w  =  - 


vNyq 


a  = 


vNyq  ’ 


and 


VNyq  —  4t- 


(2) 


2Notationally,  a  superscript  “T”  will  be  used  to  indicate  a  matrix  transpose,  a  superscript  will 
be  used  to  indicate  complex  conjugate,  and  a  superscript  “f”  will  be  used  to  indicate  conjugate 
transpose. 
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It  is  assumed  that  Af  can  be  reliably  estimated  and  treated  as  known,  and  for  convenience 
S  and  A f  are  combined  into  an  unknown  SNR  77  =  o/'Af.  The  complex  data  vector  Z  is  therefore 
governed  by  the  probability  density 


p(Z  |  ©)  =  ,r-A'|Rr1e-ztR-lz, 


(3) 


and  estimation  of  the  Doppler  velocity  u  must  be  done  in  the  context  of  the  unknown  parameter 
vector  ©r  =  [w  a  t?]. 

2.2  Experimental  Design 

2.2.1  Weather-Radar  Simulation 

Weatherlike  Doppler  signals  were  simulated  as  described  by  Zrnic  [2].  For  all  simulations, 
digital  sampling  of  Gaussian  shaped  power  spectra  was  emulated,  including  Nyquist  folding,  until 
all  tail  values  with  power  greater  than  0.01  were  accounted  for.  Given  an  M  point  input  power  spec¬ 
trum,  Zrnil’s  method  generates  a  random  vector  of  M  consecutive  process  samples.  A  brief  study 
of  sample  estimator  performance  was  made  to  examine  (and  thereby  ensure  limits  fori  uncertainties 
due  to  power  spectral  sampling  and  Monte  Carlo  sair  pie  size.  As  a  result  of  this  investigation,  a 
1024  point  power  spectrum  representation  was  used  and,  from  the  corresponding  output  process 
vector,  a  block  of  20  contiguous  samples  was  extracted  for  use  as  a  data  sample  (the  remaining 
1004  points  were  discarded).  Also,  a  total  of  10,000  (independent)  realizations  were  used  for  each 
assessment  of  sample  mean  and  standard  error. 

Error  results  are  presented  in  both  normalized  and  unnormalized  form.  In  view  of  the  defini¬ 
tions  in  Equation  (2),  spectrum  width  and  standard  error  values  are  normalized  to  the  magnitude 
of  the  Nyquist  velocity  VNvq.  Bias  errors  are  normalized  to  the  magnitude  of  the  true  underlying 
signal  velocity.  Unnormalized  values  are  presented  in  the  context  of  weather-radar  processing  us¬ 
ing  a  10.4  cm  radar  operating  with  a  pulse  repetition  frequency  of  1000  s-1.  (The  unambiguous 
Nyquist  velocity  therefore  equals  26  m/s.) 

Provided  also,  for  reference,  are  curves  corresponding  to  the  CR  bounds  that  were  computed 
using  the  well-known  result3 


(4) 


3A  derivation  is  provided  as  an  appendix. 
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where  F  =  [fij]  represents  the  Fisher  information  matrix 


F  _  £  '^lnp(Z]0)^T  ^lnp(Z|0)y 

The  diagonal  entries  of  F  provide  the  desired  bounds: 

E[{ex  -  6tf]  >  7m,  (5) 

where  F"1  =  (Jij]  and  6,  is  any  unbiased  estimator  of  component  0,  of  0. 

2.2.2  Test-Signal  Parameters 

Performance  evaluations/comparisons  are  based  on  estimation  error  that  clearly  can  vary  as 
a  function  of  the  true  parameter  value.  Error  quantification  for  all  possible  values  of  the  (true) 
parameter  is  not  feasible.  Evaluations  are  based,  therefore,  on  comparisons  for  selected  parameter 
values. 

Target  velocities  of  0,  5  (0.192  t>/vy?)i  and  13  m/s  (0.500  v^vq)  are  used  in  the  present  evalu¬ 
ation.  Preliminary  studies  obtained  similar  error  profiles  for  velocities  in  the  range  0  to  0.5  v^yq. 
Velocities  greater  than  0.5  v^yq  are  not  examined  to  avoid  the  added  complexity  of  spectral  folding. 

Values  for  weather  spectrum  width  can  range  from  0  to  v^'yq.  For  coherent  processing,  only 
a  small  range  is  of  interest.  A  further  simplification  is  made  by  considering  only  extreme  points  of 
this  range  providing  two  classes  of  input  signal:  narrow  and  wide  spectrum  width.  Table  1  contains 


TABLE  1 

Model  Sample  Correlation  vs  Spectral  Width 


|  Sample  Correlation: 

m 

<T 

0.040 

0.100 

0.200 

i 

0.992 

0.952 

0.821 

2 

0.969 

0.821 

0.454 

3 

0.931 

0.641 

0.169 

the  correlation  values  for  the  first  three  lag  indices  in  the  model  [Equation  (2)]  corresponding 
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to  three  values  of  normalized  spectrum  width.  For  a  normalized  width  of  0.2,  there  is  significant 
decorrelation  between  samples  separated  by  two  or  more  places.  (Improved  estimation  using  higher- 
order  lag-product  statistics  would  seem  unlikely  beyond  this  point.)  The  normalized  value  0.2  is 
used  to  represent  wide  spectrum  width  signals.  Given  the  setting  of  the  simulation,  this  signal 
corresponds  to  a  weather  signal  with  a  spectrum  width  of  5  m/s.  At  the  other  end,  the  value  0.038 
(corresponding  to  a  1  m/s  spectrum  width)  is  representative  of  narrow  spectrum  width  signals. 

Initially,  performance  comparisons  are  made  for  discrete  SNR  values  in  the  range  -10  to  10  dB. 
These  (initial)  results  are  the  basis  for  a  later  shortening  of  the  examined  range  to  -4  to  10  dB. 
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3.  ESTIMATOR  EQUATIONS 


3.1  ML  and  BAYES  Formulations 

The  ML  solution  corresponding  to  Equation  (3)  is  obtained,  in  principle,  from  joint  maxi¬ 
mization  of  the  “likelihood”  function 

L(0)  =  -  In  |R|  -  Zt  R-1  Z,  (6) 

but  this  has  limited  practical  application  because  the  resulting  system  of  equations  is  neither 
explicit  nor  separable  with  respect  to  the  components  of  0.  The  Bayes  estimator  that  minimizes 
the  mean-squared  error  £[(u>  —  w)2]  is  the  mean  of  the  conditional  (posterior)  density  p(0|Z)  (see, 
for  example,  Van  Trees  [3])  where  the  posterior  density  is  derived  from  application  of  the  Bayes 
formula: 


p(0|Z)  = 


p(Z|0)p(0) 

/p(Z|0)p(0)d0 


(7) 


This  too  has  limited  practical  use  as,  notationally,  “d0”  =  udu  da  dr]”;  the  resulting  estimator 
requires  substantial  numerical  integration  to  cover  the  parameter  space  volume.  Approximations 
of  some  form  are  clearly  required  in  order  to  proceed  further.  The  approach  considered  here  first 
steps  back  and  examines  a  simpler  model  for  which  practical  (ML  and  Bayes)  solution  is  possible 
and,  later,  examines  adaptation  of  the  resulting  solutions  to  the  model  of  Equation  (3). 

Examining  the  (somewhat)  degenerate  situation  wherein  one  assumes  the  spectrum  width 
(<r)  and  signal- to- noise  (t/)  parameters  known  is,  of  itself,  very  useful.  Under  these  conditions  the 
aforementioned  computational  blocks  are  largely  avoided,  and  one  achieves  optimal  estimators  that 
form  a  foundation  for  further  analysis. 

In  the  case  of  ML  estimation,  knowing  a  and  rj  reduces  likelihood  function,  Equation  (6),  to 
the  term  ztR-xZ  defining  the  ML  solution 

M-\  M-l 

&ml  =  argmin  £  zi  l*.k  ZkeM'~k).  (8) 

w€[-»,»)  ,=o  k= 0 


In  the  above,  the  coefficients  7^*  are  the  elements  of  the  matrix  T  resulting  from  the  convenient 
redefinition 


r  =  T(<r,  7?)  =  [S G+AT]-1. 


(9) 
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Equation  (8)  is  not  explicit  for  u\  however,  a  change  of  variable  and  rearrangement  results  in  am 
interesting  frequency  domain  form: 


M- 1 

&ML  —  arg  min  Y  f  <„r)  e--'4*'"1 , 


more  simply, 


{M-l 

m=0 

-(D 

where  is  the  T-weighted  autocorrelation  estimate  defined  by 
M—m  —  1 

^  ^  ^  7»,i+m  2 i+m 

«=0 


(10a) 


(10b) 


(11) 


From  inspection  of  Equation  (10)  it  is  clear  that  a  solution  to  this  ML  equation  can  be  obtained 
by  implementing  a  standard  FFT  transform  of  the  weighted  autocorrelation  estimates  The 
transform  samples  Equation  (10)  with  a  resolution  of  2f/Nfft  {Nfft  >s  the  FFT  size)  and  thus, 
by  zero  filling,  computes  u>ml  to  a  desired  resolution. 

A  similar  computational  reduction  occurs  for  the  Bayes  formulation.  If  one  assumes  a  uniform 
prior4  for  w,  the  (one-dimensional)  Bayes  estimator  can  be  written  as 


/we-ztDrD'z<L, 

J  g— zIdTD’Z  (fcj 


(12) 


The  above  FFT  computation,  which  yields  the  ML  solution,  also  provides  the  exponent  in  Equa¬ 
tion  (12)  and  therefore  a  starting  point  for  computing  the  posterior  mean. 

Comment  1.  If,  in  Equation  (11),  7i,i+m  =  1,  then  riP  is  equivalent  to  the  (biased)  sam¬ 
ple  autocorrelation  estimate,  and  an  FFT  implementation  of  Equation  (10)  is  equivalent  to  the 
(windowed)  power  spectral  density  estimate  obtained  using  the  Bartlett  window.  Hence,  there  is 


4For  this  report  only  noninformative  priors  are  considered,  which,  for  u,  can  be  stated  as  p(u>)  = 
l/2ir,  -IT  <  W  <  7T. 
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also  an  equivalence  between  the  FFT  computation  for  Equation  (10)  and  the  periodogram  spectral 
estimate 

M- 1 

m=0 

The  resulting  ML  estimator  would  therefore  be  equivalent  to  the  Periodogram  Maximization  esti¬ 
mator  considered  by  Mahapatra  and  Zrnic  [4].  (The  apparent  contradiction  in  which  a  minimization 
of  the  power  spectrum  in  Equation  (10)  defines  the  ML  estimate  is  accounted  for  by  the  sign  of  the 
weights  7i,i+m-) 

Comment  2.  If  7,,,+m  =  7m,  then  Equation  (10)  implements  a  classic  smoothed  periodogram 
estimate.  However,  T  (the  inverse  of  a  Toeplitz  matrix)  is  only  asymptotically  Toeplitz  (though  T 
is  persymmetric:  7 =  'fM-jM-i)-  This  nevertheless  does  suggest  an  alternative  frequency  domain 
solution  and  approximation  based  on  implementation  of  a  smoothed  periodogram  estimate. 

3.1.1  Smoothed  Periodogram  Approximation 

For  this  report,  the  weighted  autocorrelation,  Equation  (11),  is  viewed  as  posing  no  computa¬ 
tional  difficulty  (the  results  presented  here  assume  such  an  implementation).  However,  a  Toeplitz 
approximation  to  T  provides  a  useful  heuristic  within  which  to  view  the  relevance  of  the  parame¬ 
ters  0  and  rf.  This,  in  turn,  may  suggest  approaches  to  the  more  general  case  where  o  and  rj  are 
unknown. 

In  treating  T  as  Toeplitz,  one  can  compute  smoothing  weights  by  taking  averages  along  the 
diagonals  of  T: 

_j  M—m  —  1 

=  M  -  m  -  1  ^  7,',+m 

t=0 

(The  minus  sign  is  introduced  here  so  that  the  ML  solution  of  Equation  (10)  can  be  associated  with 
finding  the  maximum  of  the  power  spectral  density  estimate.)  This  maximum  likelihood  window 
function,  indexed  by  rj  and  <7,  can  be  used  to  define  an  approximate  ML  solution 

M- 1 

uaml  =  arg  max  7mfm«',u'm,  (14) 

where  fm  represents  the  normal  (unweighted)  autocorrelation  estimate. 

The  relationship  between  7m  and  the  parameters  a  and  rj  is  documented  by  Figure  1  where 
frequency-domain  smoothing  windows  (Fourier  transforms  of  the  coefficients  7m;  interpolated  for 
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-71  0  7t 

0) 

Figure  1.  ML  matched  filter  wtndouis:  < r  and  tj  dependence.  Frequency  domain  smoothing 
windows  obtained  by  Fourier  transforming  the  weighting  coefficients  are  plotted  for 
three  values  of  weather  SNR  and  a  range  of  weather  spectrum  width  values. 
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clarity)  are  presented  for  three  SNR  values:  -10,  0,  and  10  dB,  and  a  range  of  normalized  spectrum 
widths.  At  10  dB  SNR  and  for  low  to  moderate  spectrum  widths  the  smoothing  windows  appear 
to  be,  practically  speaking,  square-window  averagers.  This  suggests  the  generalized  view  that  the 
spectrum  width  parameter  a  primarily  affects  window  width.  As  SNR  decreases,  the  dominant 
feature  change  is  a  scaling  of  the  windows  to  smaller  magnitudes  (although  shape  rounding  and 
bandwidth  broadening,  as  measured  by  3  dB  width,  are  clearly  evident).  It  is  helpful  to  add 
to  the  generalization  by  stating  that  the  windows  are  scaled  according  to  the  SNR  parameter  rj. 
Predictions  that  result  from  these  simplifications  are  as  follows.  Clearly  [from  Equation  (10)], 
scaling  the  window  does  not  affect  ML  estimation,  and  hence  the  value  of  the  SNR  parameter 
T)  would  not  be  expected  to  have  a  strong  effect  on  the  performance  of  an  ML  implementation. 
However,  in  the  case  of  the  Bayes  implementation,  scaling  coupled  with  exponentiation  implicitly 
introduces  a  form  of  signal  isolation.  That  is,  by  its  nonlinear  nature,  the  exponentiation  in 
Equation  (12)  enhances  (isolates)  large  spectral  lines  in  the  smoothed  periodogram,  and  scaling  of 
the  window  (by  rj)  modulates  this  isolation  by  adjusting  the  power  of  exponentiation.  Hence,  one 
would  anticipate  that,  at  least,  the  bias  of  the  Bayes  algorithm  would  be  affected  by  the  q  value 
used.5 


3.2  Time-Domain  Processing:  Pulse  Pair  (PP) 

The  PP  estimator  is  standard  in  weather  radar  applications,  and  its  formulation  can  be 
obtained  from  many  sources.  Originally  proposed  in  the  context  of  the  independent  sampling  of 
PP  measurements  (for  example,  see  Miller  et  al.  [5],  time-domain  formulas  for  PP  estimators  have 
been  routinely  applied  to  vector  measurements  such  as  those  represented  by  the  sample  Z.  For  the 
purpose  of  this  report,  the  PP  estimator  is  defined  by  the  equation 


M— 2 

upP  =  arg  ^  zi  z.+i-  (15) 

i=0 

When  M  =  2,  this  also  defines  the  ML  estimate,  as  can  be  seen  by  comparison  with  Equation  (8). 
Clearly,  as  a  increases  and  the  correlation  between  samples  decreases  one  would  also  predict  upp 
to  approach  the  performance  of  Cjml. 

3.3  Frequency-Domain  Processing:  Wind  Profiling  (WP) 

Among  the  various  published  frequency-domain  methods  is  one  that  has  been  developed  for 
use  in  WP  networks — a  task  with  inherently  low  SNR  values  (see  May  and  Strauch  [6]).  The  WP 


5The  presence  of  a  uniform  noise  floor  in  the  spectral  density  estimate  induces  a  bias  toward  the 
value  zero  when  the  mean  of  the  density  is  computed. 
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estimator  of  May  and  Strauch  [6]  is  periodogram  derived  and  has  led  to  general  claims  of  improved 
Doppler  mean  estimation  in  the  case  of  low  SNR;  hence,  it  is  of  interest  for  the  present  report. 

The  estimator,  WP,  was  derived  with  one  major  departure  from  the  description  given  in  May 
and  Strauch  [6]  (their  First  Moment  or  FM  algorithm).  Computational  aspects  of  this  algorithm 
are  presented  below  by  way  of  a  comparison  with  the  proposed  (one-dimensional)  ML  and  Bayes 
algorithms.  Note,  however,  that  there  is  one  important  difference  between  the  WP  algorithm  defined 
here  and  the  algorithm  described  in  May  and  Strauch  [6].  The  WP  network  makes  generous  use  of 
periodogram  averaging  as  a  means  of  stabilizing  the  periodogram  estimates  (Bartlett’s  procedure; 
see  the  next  section).  Because  the  emphasis  of  this  report  is  on  estimation  using  a  fixed  small- 
sample  size,  no  such  averaging  is  possible.  (This  report  does  not  consider  the  possibility  of  data 
averaging  across  range  gates  or  neighboring  radials.) 

Estimate  Stabilization.  At  the  heart  of  the  WP  algorithm  is  an  implementation  of  Bartlett’s 
procedure  for  estimating  the  power  spectral  density.  That  is,  the  data  segment  for  analysis  (length 
M  data  segment  corresponding  to  a  single  range  cell  observation)  is  divided  into  q  equal  length  sub- 
segments,  each  of  which  is  used  to  obtain  an  M/g-length  periodogram  estimate.  The  q  power  spec¬ 
tral  estimates  are  averaged  to  improve  the  stability  of  the  power  spectral  estimate.  The  Bayes  im¬ 
plementation  [Equation  (12)]  also  can  be  viewed  as  having  a  power  spectral  estimator — a  smoothed 
periodogram — at  the  heart  of  its  procedure.  Because  Bartlett’s  method  per  se  is  not  appropriate  for 
the  small  single-sample  case  (the  primary  interest  here)  it  may  be  argued  that  smoothing  windows 
therefore  are  necessary  to  stabilize  the  estimates.  (It  should  be  remarked  that,  generally  speaking, 
the  goals  of  mean  Doppler  velocity  estimation  and  power  spectral  density  estimation  are  not  one 
and  the  same.  Hence,  general  arguments  for  improving  power  spectral  estimation  do  not  necessarily 
carry  over  to  improved  velocity  estimation.) 

Signal  Isolation.  After  obtaining  an  estimate  of  the  power  spectral  density,  the  WP  method 
proceeds  with  an  ad  hoc  attempt  to  isolate  signal  from  noise.  A  noise  floor  for  the  spectral  estimate 
is  determined,  and  censoring  (zeroing)  of  all  spectral  coefficients  beyond  the  first  crossing  of  the 
noise  floor,  when  proceeding  from  that  frequency  index  with  maximum  power,  is  performed.  The 
noise  power  level  is  also  subtracted  from  the  remaining  interval  of  nonzero  values,  and  a  mean 
frequency  value  is  computed  by  constructing  a  density  from  the  remaining  spectral  coefficients  and 
computing  its  mean.  As  previously  mentioned  for  the  Bayes  implementation,  the  combination  of 
window  scaling  and  exponentiation  can  be  given  the  heuristic  interpretation  of  signal-from-noise 
isolation. 
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4.  PERFORMANCE  WITH  KNOWN  a  AND  v 


This  section  presents  a  baseline  analysis  corresponding  to  a  one-dimensional  parameter  space 
(i.e.,  a  and  r?  assumed  known).  These  Monte  Carlo  results  provide  optimal  performance  measures 
for  each  method.  In  the  case  of  the  Bayes  implementation,  results  for  known  a  and  r\  also  provide 
a  greatest  lower  bound  for  the  standard  error  of  estimating  mean  Doppler  velocity.6  In  later 
sections,  the  Bayes  curves  determined  here  are  employed  with  the  label  “Bayes  Bound”  for  velocity 
estimation. 

4.1  Zero  Mean  Velocity 

Figure  2  presents  the  simplest  comparison:  estimation  of  a  zero-mean  Doppler  weather  target 
(eliminating,  for  the  moment,  the  contribution  of  bias7  in  standard  error  comparisons).  The  figure 
plots  standard  error  vs  input  SNR  for  two  cases:  a  narrow  input  spectrum  width  ( a  —  0.038  v^yq) 
and  a  wide  input  spectrum  width  (cr  =  0.192  v^yq).  The  corresponding  CR  lower  bounds  are 
included  in  each  panel. 

4.1.1  Narrow  Spectrum  Widths 

ML,  PP,  and  \VP.  For  narrow  spectrum  widths  ML,  PP,  and  WP  estimates  all  exhibit 
similar  functional  relationships  with  respect  to  SNR.  A  uniform  ranking  of  these  estimators  (across 
SNR)  cannot  be  deduced  from  the  data:  ML  is  clearly  best  at  high  SNR  (7  to  10  dB)  values  but 
the  WP  method  appears  relatively  better  at  low  SNR  values  (less  than  0  dB). 

CR  Bound.  For  narrow  spectrum  widths,  the  CR  bound  is  well  below  the  error  curve  of 
any  of  the  above  three  estimators — lower  by  nearly  a  factor  of  two  at  the  high  SNR  end.  This 
discrepancy  between  CR  bound  and  observed  performance,  which  is  even  more  substantial  for  low 
SNR  values,  could  erroneously  lead  to  speculation  that  much  improved  performance  is  possible. 

Bayes  Bound.  The  curve  for  the  Bayes  standard  error  shows  the  CR  bound  to  be  overly 
optimistic.  However,  the  Bayes  estimator  clearly  exhibits  a  performance  gain  for  SNR  values  in 
the  range  0  to  10  dB  (at  10  dB,  the  Bayes  standard  error  is  respectively  0.72,  0.66,  and  0.52% 


Comparisons  involving  the  ML,  Bayes,  and  WP  performance  statistics  must  concede  an  error  (bias) 
that  results  from  the  finite  length  FFT  implementation.  For  these  frequency  domain  estimators,  the 
sampling  resolution  o{2/Nfft  va iyq  is  exhibited  as  a  bias  in  the  range  ±1/Nfft  vNyq  (and  therefore 
maps  an  interval  about  the  error  values  reported  here).  All  frequency  domain  computations  in  this 
report  were  computed  using  an  FFT  length  of  64. 

7For  this  zero  mean  Doppler  signal,  each  of  the  three  frequency  domain  methods  was  found  to 
exhibit  an  estimated  bias  below  the  resolution  defined  by  the  64  point  FFT  implementation  (i.e., 
<  1/32),  regardless  of  the  r)  value. 
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Figure  2.  Optimal  velocity  estimation  standard  error  for  zero-mean  Doppler  weather. 
Both  panels  plot  estimated  standard  error  for  four  methods  of  Doppler  velocity  estimation: 
ML,  Bayes,  WP  derived,  and  PP.  The  upper  panel  corresponds  to  weather  having  a  narrow 
spectrum  width  and  the  lower  corresponds  to  weather  having  a  wide  spectrum  width. 
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that  of  the  ML,  PP,  and  WP  estimators;  at  0  dB,  the  Bayes  standard  error  is  respectively  0.58, 
0.60,  and  0.54%  that  of  the  others).  For  SNR  values  less  than  0  dB,  adequate  separation  of  signal 
from  noise  becomes  more  difficult  and  the  Bayes  standard  error  drops  as  its  estimate  becomes 
increasingly  biased  toward  zero.  Comparisons  among  the  estimators  for  SNR  values  below  0  dB 
should  therefore  include  bias  error,  and  this  is  included  in  the  sections  to  follow. 

4.1.2  Wide  Spectrum  Widths 

For  wide  spectrum  widths  also,  the  general  observations  of  the  previous  section  apply,  but  of 
particular  note  for  these  input  signals  is  the  following.  ML  and  PP  performances  are  predictably 
close,  although  ML  is  better  at  higher  SNR  values  and  appears  to  approach  the  CR  bound  (as  SNR 
increases),  whereas  the  PP  curve  appears  to  plateau  at  a  level  distinctly  above  the  CR  bound.  From 
inspection  of  Table  1,  one  may  surmise  that  at  a  normalized  spectrum  width  of  0.2,  the  decorrelation 
between  samples  is  such  that  very  little  improvement  can  be  realized  from  using  higher  lag  terms 
in  the  estimation  process.  In  confirmation,  there  is  less  difference  between  performance  of  PP  and 
ML  and  the  CR  and  Bayes  bounds;  however,  note  that  WP  stands  alone  as  a  clearly  suboptimal 
estimate.  This  exceptionally  marked  degradation  in  WP  performance  persists  to  high  SNR  values 
and  confirms  a  wide-spectrum-width  weakness  identified  by  other  investigators.  The  ML  estimator 
appears  to  achieve  the  lower  Bayes  bound  for  SNR  values  in  the  5  to  10  dB  range,  and  both  ML 
and  Bayes  improve  upon  PP  performance  over  this  range  (at  10  dB,  the  Bayes  standard  error  is 
respectively  0.98,  0.81,  and  0.37%  that  of  the  ML,  PP,  and  WP  estimators).  For  SNR  values  0  to 
5  dB,  ML  and  PP  performances  are  essentially  equivalent,  and  both  depart  appreciably  from  the 
Bayes  bound  as  SNR  approaches  0  dB  (at  0  dB,  the  Bayes  standard  error  is  respectively  0.83,  0.83, 
and  0.64%  that  of  ML,  PP,  and  WP). 

As  a  secondary  note,  one  should  observe  that  there  is  no  conflict  in  the  fact  that  the  CR 
bound  and  Bayes  error  curves  cross  (there  is  an  implied  crossing  of  these  two  curves  for  the  narrow 
spectrum  width  case  as  well).  This  only  serves  as  a  reminder  that  the  CR  bound  applies  to  the 
performance  of  unbiased  estimators,  which  the  Bayes  estimator,  generally  speaking,  is  not. 

4.2  Nonzero  Mean  Velocity 

Standard  error  and  bias  results  for  nonzero  mean  velocities  (r  =  5  m/s  and  v  =  13  m/s)  and 
narrow  and  wide  spectrum  width  cases,  as  above,  are  presented  in  Figure  3.  Standard  error  results 
are  summarized  in  the  upper  half  of  each  panel,  and  bias  results  are  summarized  in  the  lower. 

4.2.1  Standard  Error 

For  SNR  values  greater  than  0  dB,  there  is  general  agreement  (within  the  resolution  of  the 
Monte  Carlo  parameterization)  with  the  results  of  Figure  2.  Below  0  dB,  there  is  a  notable  departure 
(meet  evident  for  the  Bayes  results)  due  to  inclusion  of  bias  error. 
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4.2.2  Bias 


For  both  narrow  and  wide  spectrum  widths  there  is  a  breakdown  of  all  four  methods  as  SNR 
decreases  below  0  dB.  For  SNR  values  greater  than  0  dB,  the  bias  of  each  estimator  appears  to 
be  within  acceptable  limits  and  generally  near  zero.  A  large  proportion  of  the  improved  (standard 
error)  performance  of  the  B~.yes  method,  for  SNR  <  0,  is  at  the  cost  of  increased  bias  (toward  zero). 
Generally,  the  PP  estimator  appears  to  have  the  best  (i.e.,  smallest)  bias  performance.  Departure 
from  zero-bias  performance  in  the  narrow  spectrum  width  case  occurs  earlier  (i.e.,  at  higher  SNR 
values)  in  comparison  to  the  wide  spectrum  width  case. 

PP  and  ML  results  do  not  appear  to  change  appreciably  when  the  weather  velocity  is  changed 
from  5  to  13  m/s.  Although  the  theoretical  CR  bound  does  not  depend  on  weather  velocity,  this 
is  not  true  for  the  bound  provided  by  the  Bayes  estimate.  Performance  for  the  Bayes  and  WP 
estimators,  which  both  compute  a  spectral  mean,  deteriorate  when  weather  velocity  is  increased  to 
13  m/s — a  performance  loss  due  to  spectral  folding.  (Interestingly,  moving  weather  velocity  from 
5  to  13  m/s  has  its  most  significant  bias  effect  in  increasing  that  of  the  PP  estimator.) 

4.3  Summary 

4.3.1  Narrow  Spectrum  Widths 

For  narrow  spectrum  widths  ML,  PP,  and  WP  methods  all  have  similar  performance  charac¬ 
teristics  (over  the  range  of  SNR  values  examined).  Nevertheless,  it  can  be  argued  that  ML  provides 
better  performance  at  higher  SNR  values.  Clearly,  the  indication  is  that  higher  SNR  values  are 
required  to  bring  out  a  decisive  advantage  here  from  the  ML  implementation;  a  more  extensive 
Monte  Carlo  analysis  (including  higher  SNR  values)  would  be  needed  to  measure  the  extent  of 
these  improvements.  As  SNR  decreases  away  from  0  dB,  all  methods  begin  to  fail  although  the 
bias  performance  of  PP  is  uniformly  best.  The  CR  bound  is  much  lower  than  the  performance  of 
all,  but  the  Bayes  results  show  the  information  bound  to  be  overly  optimistic  for  this  small-sample 
(M  =  20)  case.  The  asymptotic  optimality  of  ML  estimation  was  also  demonstrated  to  be  of  no 
consequence  for  this  small-sample  case.  The  Bayes  estimator  demonstrates  a  markedly  improved 
performance,  but  for  SNR  values  below  zero,  this  is  at  least  in  part,  at  the  expense  of  increased 
bias.  Bias  does  not  appear  to  contribute  appreciably  to  estimation  error  for  SNR  values  above 
0  dB.  All  four  methods,  however,  do  exhibit  notable  bias  for  SNR  values  in  the  range  -5  to  0  dB. 
Bias  comparisons  appear  to  always  favor  PP  estimation. 

4.3.2  Wide  Spectrum  Widths 

At  wider  spectrum  widths  ML  and  PP  are  for  the  most  part  similar,  but  the  ML  results 
show  a  slight  improvement  at  higher  SNR  values  (greater  than  5  dB).  Although  Bayes  performance 
represents  the  optimum,  ML  and  PP  are  close  to  its  bound  (compare  to  the  narrow  spectrum  case); 
all  three  estimators  perform  near  the  CR  bound  (again,  in  comparison  to  the  narrow  spectrum 
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case).  However,  the  WP  method  dearly  has  an  undesirable  performance  at  wide  input  spectrum 
widths.  An  explanation  for  this  is  offered  in  Section  5.2. 
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5.  PERFORMANCE  WITH  ARBITRARY  a  AND  77 


The  previous  section  indicated  that  a  small  performance  improvement  (as  measured  by  stan¬ 
dard  error)  might  be  obtained  using  an  ML  formulation  (relative  to  PP  and  at  high  SNR  values 
only),  and  that  a  much  improved  (and  optimal)  performance  might  result  from  a  Bayes  implemen¬ 
tation.  The  task  at  hand,  now,  is  to  preserve  these  performance  gains  while  addressing  the  issue 
that  a  and  77  are  never  known  exactly. 

In  the  remainder,  the  Bayes  and  ML  algorithms  will  process  data  using  approximate  values 
for  the  parameters  a  and  77  and,  in  that  sense,  represent  suboptima]  algorithms.  (However,  for 
convenience,  the  labels  Bayes  and  ML  will  still  be  used.)  Before  evaluating  practical  approaches  to 
estimating  a  and  77  it  is  useful  to  examine  the  effect  of  arbitrary  a  and  77  values  on  Bayes  and  ML 
performance.  In  other  words,  for  velocity  estimation,  first  consider  whether  it  is  important  that 
either  a  or  77  be  known  at  all.  This  examination  is  made  by  keeping  one  parameter  fixed  at  its 
known  value  while  varying  the  other  among  appropriate  candidate  values. 

5.1  Sensitivity  to  Incorrect  77 

Figures  4  and  5  continue  the  narrow/wide  spectrum  width  analysis  of  before  by  examining 
performance  when  data  are  processed  assuming  either  one  of  two  fixed  77  values:  0  or  10  dB.  These 
choices  represent  logical  test  cases  in  the  sense  of  asking  whether  reasonable  performance  can  be 
obtained  by  categorically  treating  the  data  as  either  low  or  high  SNR  data.  Figure  4  considers 
ML  estimation  for  the  narrow  and  wide  spectrum  width  case;  Figure  5  repeats  the  analysis  for 
the  Bayes  estimator  (comparisons  for  Doppler  weather  targets  of  5  m/s  (0.192  v^vq)  and  13  m/s 
(0.500  ujvyq)  a-re  presented).  In  this,  and  all  following  figures,  the  Bayes  performance  curve  for  the 
case  of  known  a  and  77  (Section  4)  is  repeated  as  the  “Bayes  Bound.”  The  results  for  PP  estimation 
are  also  reproduced  for  continued  comparison. 

5.1.1  ML  Algorithm 

In  the  case  of  narrow  spectrum  width  weather  (Figures  4(a)  and  4(b)],  the  parameter  77  as 
predicted  (Section  3.1.1)  has  no  apparent  effect  on  ML  performance.  This  is  not  quite  the  case, 
however,  for  wide  spectrum  width  weather  [Figures  4(c)  and  4(d)]  where  mismatch  between  assumed 
and  actual  SNR  results  in  increased  estimation  error.  As  will  be  seen  also  in  the  case  of  the  Bayes 
algorithm,  using  a  10  dB  value  for  the  SNR  parameter  results  in  a  performance  loss  (relative  to 
PP)  for  input  SNR  values  below  6  dB.  Hence,  the  view  of  a  smoothing  window  shape  independent 
of  77  is,  in  places,  an  oversimplification. 

5.1.2  Bayes  Algorithm 

From  Figures  5(a-d),  it  is  clear  that  the  Bayes  implementation,  using  an  arbitrary  fixed  77,  has 
a  substantially  altered  performance.  In  neither  case  (0  or  10  dB  window)  did  performance  match 
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Figure  4-  A/I  algorithm  performance  sensitivity:  effect  of  signal-to-noise  parameter  r] 
(a)  v  =  5  m/s,  <rw  =  0.038 


24 


NORMALIZED  STANDARD  ERROR 


STANDARD  ERROR  (rn/s) 


ard  error 


BIAS  (m/s)  STANDARD  ERROR  (m/s) 


r|  SENSITIVITY:  ML 


90675-9 


0  269 


0  231 


0.192 


0  154 


H  0115 


H  0  077 


H  0  038 


Figure  f.  ML  algorithm  performance  sensitivtly:  effect  of  signal-to-notse  parameter  ij. 
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Figure  f.  ML  algorithm  performance  sensitivity:  effect  of  signal-to-noise  parameter  rj. 
(d)  v  =  13  m/s,  <rw  =  0.192  vNyq. 


27 


NORMALIZED  STANDARD  ERROR 


STANDARD  ERROR  (m/s) 


-4  -2  0  2  4  6  8  10 


SNR  (dB) 


Figure  5.  Bayes  algorithm  performance  sensitivity:  effect  of  signal-to-noise  parameter 
T).  (a)  v  =  5  m/s,  <rw  =  0.038  VNyq. 
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Figure  5.  Bayes  algorithm  performance  sensitivity:  effect  of  signal-to-noise  parameter 
r).  (c)  v  =  5  m/s,  aw  =  0.192  v^v,. 
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Ftgurt  5.  Bayes  algorithm  performance  sensitivity:  effect  of  signal-to-noise  parameter 
T).  (d)  v  =  13  m/s,  aw  =  0.192  vNvq. 
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the  optimum  Bayes  Bound  for  all  input  SNR  values;  nor  did  it,  at  least,  uniformly  improve  upon 
PP  performance. 

For  the  narrow  spectrum  width  case  of  Figure  5(a),  there  is  an  (apparently)  anomalous 
crossing  of  the  Bayes  Bound  curve  by  the  performance  curve  assuming  r?  =  0  dB.  This  is  not  a 
contradiction  because  the  Bayes  Bound  optimality  applies  only  in  the  sense  of  average  performance 
against  the  totality  of  all  possible  weather  target  velocities.  Hence,  it  is  possible  to  obtain  lower 
standard  errors  for  this  particular  weather  velocity  of  5  m/s.  (Clearly,  the  estimator  u  =  5  m/s, 
an  extremely  degenerate  case,  has  zero  standard  error  and  bias  at  this  test  point.)  With  the  0  dB 
window,  note  the  severe  compromise  in  estimator  bias.  This  bias  is  reflected  in  the  (standard  error) 
performance  lo66,  which  is  most  notable  at  higher  velocities  [see  Figure  5(b)].  Hence,  for  narrow 
spectrum  width  weather,  processing  the  data  with  an  assumed  SNR  of  0  dB  incurs  the  penalty  of 
increased  bias  result!  ig  from  inadequate  signal  isolation. 

Assuming  an  t]  of  0  dB  does  make  sense  for  wide  spectrum  width  weather  [Figures  5(c)  and 
(d)].  For  low  input  SNR  values,  the  optimal  Bayes  Bound  is  matched,  and  at  higher  input  SNR 
values,  performance  appears  to  be  no  worse  than  that  of  PP.  The  bias  of  this  implementation  is 
also  very  similar  to  that  of  PP. 

At  the  other  extreme,  processing  the  data  assuming  q  =  10  dB  appears  to  better  PP  perfor¬ 
mance  in  the  case  of  narrow  spectrum  width  signals  but  at  the  cost  of  a  performance  loss  (vs  PP) 
at  low  input  SNR  values  and  wider  spectrum  widths  [Figures  5(c)  and  5(d)].  The  bias  performance 
with  a  10  dB  window,  if  anything,  does  improve  upon  that  of  the  ideal  (q  known)  case. 

5.2  Sensitivity  to  Incorrect  a 

For  this  series,  the  known  values  for  q  were  used  in  the  algorithm  and  a  set  of  values  for  the 
parameter  <7,  ranging  above  and  below  the  true  va'ue,  were  tested.  In  contrast  to  varying  r/,  the 
range  of  a  values  tested  did  not  appreciably  change  the  bias  results.  Therefore,  bias  curves  for  this 
set  are  not  presented.  Figures  6  and  7  summarize  the  standard  error  results  for  ML  and  Bayes 
algorithms  respectively. 

For  the  narrow  input  spectrum  [Figures  6(a),  6(b),  7(a),  and  7(b)],  a  values  ranging  from 
atruc/4  to  5<7true8  were  tested;  for  the  wide  input  spectrum  [Figures  6(c),  6(d),  7(c),  and  7(d)], 
values  ranging  from  <7(rue/5  to  9/5 a(rue9  were  tested.  In  plotting  the  results,  the  performance 
region  spanned  by  underestimating  o  is  marked  in  white;  the  performance  region  spanned  by 
overestimating  a  is  indicated  with  dark  shading.  Performance  curves  for  each  of  the  tested  values 
are  included  as  thin  lines;  the  lowest  curve  in  each  panel  always  represents  the  performame  when 


8Values  0.25,  0.5,  1.0,  2,  3,  4,  and  5  m/s  {atTXlC  =  1  m/s)  were  examined. 

9Values  1,  2,  3,  4,  5,  6,  7,  8,  and  9  m/s  ( atTXie  =  5  m/s)  were  examined. 
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Figure  6.  ML  algorithm  performance  sensitivity:  effect  of  spectrum  width  parameter  a , 
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Figure  7.  Bayes  algorithm  performance  sensttivity:  effect  of  spectrum  width  parameter 
a.  (a)  v  —  5  m/s,  aw  =  0  038  .  (b)  v  =  13  m/s,  =  0.038  rjvyf 
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Ftgurt  7.  Bayes  algorithm  performance  sensitivity:  effect  of  spectrum  width  parameter 
c.  (c)v  —  5  m/s,  ffw  =  0.192  (d)v  =  13  m/s,  <7W  =  0.192  v^yq. 
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there  is  a  perfect  match  between  the  assumed  a  value  and  that  of  the  weather.  The  heavy  curve 
in  each  panel  is  PP  performance  for  reference. 

In  general,  significant  deviation  from  optimal  performance  occurs  for  a  parameter  values 
grossly  in  error  (using  values  3,  4,  and  5  m/s  when  atrUe  =  1  m/s,  and  using  values  1  and  2 
m/s  when  <r<rue  =  5  m/s).  Because  ML  performance,  assuming  a  and  r\  known,  is  close  to  PP 
performance,  not  knowing  the  correct  value  for  a  generally  results  in  a  significant  performance  loss 
[Figures  6(a-d)].  For  narrow  spectrum  widths  and  SNR  values  in  the  range  0  to  5  dB,  a  marked 
performance  gain  still  exists  for  the  Bayes  estimator,  regardless  of  the  value  used  for  the  parameter 
<J.  The  potential  compromise  at  larger  SNR  values,  however,  indicates  that  the  algorithm  does  not 
perform  well  with  a  a  value  that  is  too  distant  from  the  underlying  true  value. 

Comment.  The  WP  method,  prior  to  noise  subtraction  and  censoring,  can  be  viewed  as 
a  Bayes  algorithm  wherein  the  parameter  a  is  assumed  to  have  an  arbitrary  narrow  width  (no 
frequency  domain  smoothing  is  done).  The  plots  in  Figure  7(c)  and  7(d)  confirm  this  notion  as  it 
can  be  seen  that  the  Bayes  performance  curves  approach  those  of  the  WP  method  (see  Figure  3) 
when  a  narrow  o  is  assumed  but  the  weather  possesses  a  wide  spectrum  width. 

5.3  Summary 

It  is  unclear  whether  arbitrary  fixed  values  for  o  and  r)  can  guarantee  an  estimator  performance 
that  is  consistently  better  than  that  of  PP.  Bayes  performance  relies  heavily  on  knowledge  of  both 
a  and  77,  ML  performance,  more  appreciably  on  o. 

In  the  smoothing  window  view  of  Section  3.1.1,  it  is  not  enough  to  smooth  arbitrarily.  It 
is  important  that  the  data  be  processed  with  an  amount  of  smoothing  that  matches  correlation 
strength  between  samples.  Treating  the  data  as  being  highly  correlated  (low  o)  demonstrated  a 
severe  performance  loss  when  weather  signals  in  fact  had  wide  spectrum  widths.  This  explains 
why  periodogram  based  algorithms,  such  as  WP,  do  not  perform  well  given  wide  spectrum  width 
weather  (too  much  weight  is  given  to  higher  lag  products).  Unfortunately,  treating  the  data  as 
being  largely  uncorrelated  (high  o)  resulted  in  a  corresponding  performance  loss  with  input  signals 
having  narrow  spectrum  widths  and  high  SNR  values.  Nevertheless,  the  indications  are  good  that 
improved  performance  (relative  to  PP)  is  possible  with  a  smoothing  methodology  requiring  less 
than  perfect  knowledge  of  o  and  t/. 

The  ML  implementation  at  best  only  matches  PP  performance;  any  performance  gain  at 
higher  SNR  levels  would  clearly  be  compromised  by  inaccurate  knowledge  of  o.  (Hence,  the  ML 
implementation  will  not  be  considered  further  in  this  report.)  A  successful  Bayes  implementation 
will  require  an  adaptive  selection  of  smoothing  (weighting)  coefficients,  which  is  the  focus  of  the 
next  section. 
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6.  PERFORMANCE  WITH  ADAPTIVE  <7  AND  v 


With  an  emphasis  on  the  estimation  of  u>,  one  can  adopt  the  view  whereby  a  and  r\  are 
treated  as  nuisance  parameters.  Here,  there  exists  a  natural  Bayesian  method  of  treatment:  removal 
through  expectation.  This  logical  recourse  unfortunately  does  not  lead  to  algorithm  simplification 
(in  the  present  case),  requiring  as  much  computation  as  that  needed  for  solving  the  vector  parameter 
problem.  (It  appears  that  the  integrations  required  for  nuisance  parameter  removal  must  be  done 
numerically  and  also  require  the  data  vector  to  be  in  hand.)  Coupled  with  this  observation,  the 
results  of  the  previous  sections  motivate  an  approach  that  seeks  to  adapt  the  computational  form 
of  Equation  (12)  using  (suboptimal)  estimates  for  a  and  r).  Simple  estimation  of  o  and  t]  (using 
method  of  moments  estimates  described  below)  and  direct  substitution  into  Equation  (9)  and  (12) 
(sample  by  sample)  were  found  to  yield  a  performance  clearly  worse  than  that  of  PP.  This  is  not 
(entirely)  unexpected  because,  like  u>,  a  and  tj  are  being  estimated  from  a  small  sample  and  (u>- 
performance)  sensitivity  to  large  deviations  from  the  true  o  and  T)  values  can,  on  average,  do  more 
harm  than  good.  Clearly,  an  approach  is  needed  whereby  the  estimated  values  b  and  r),  substituted 
into  Equation  (12)  via  Equation  (9),  are  suitably  constrained  to  minimize  penalties  that  result  from 
their  inaccuracy.  This  section  describes  one  such  approach  that  was  found  to  be  successful. 

The  general  processing  strategy  considered  is  as  follows.  Given  a  data  sample  Z,  suboptimal 
estimates  of  a  and  r)  are  first  computed  and  used  to  select  a  weighting  coefficient  array  ( matched 
filter)  r  from  a  small,  fixed,  and  predetermined  family  of  matrices;  the  chosen  matrix  is  used  to 
process  the  data  as  per  Equation  (12)  and  provide  a  velocity  estimate.  The  number  of  matrices 
required,  their  coefficient  specification,  and  the  criteria  used  to  choose  from  among  them  is  the 
subject,  then,  of  this  present  section. 

6.1  Constrained  Inverse  Filter  (T)  Selection 

This  section  focuses  on  the  nuisance  pair  (cr,  tj)  as  an  element  of  a  (parameter)  set  $  that  is 
assumed  to  be  partitioned  into  K  disjoint  pieces,  i,e.. 


A'-l 

$  =  where  D  V j  =  0  (i  ^  j). 

*= o 


Each  region  is  assigned  an  optimal  representor  (<r,  T])k  €  ty,  and  a  corresponding  weighting 
matrix  T*  is  computed  as  per  the  definition  [Equation  (9)].  A  representor  for  a  given  *  is 
determined  by  means  of  a  minimization  involving  the  directed  divergence  [7]  (i.e.,  Kullback-Leibler 
information) 


I(p  :<?)  =  £„ 


pm 

9(Z). 


(16) 
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The  divergence  I(p  :  q)  has  a  useful  interpretation  as  a  distance,10  measuring  an  ability  to  dis¬ 
criminate  pi_  ility  density  p  (and,  hence,  its  corresponding  model)  from  alternative  q  (note: 
I(p  '•  ?)  >  0  and  HP  '■  ?)  =  0  p  =  q).  Furthermore,  the  divergence  [Equation  (16)]  is  easily 
computed  for  the  Gaussian  case  [Equation  (3)]:  if  Ti  and  r2  correspond,  respectively,  to  densities 
Pi  and  ps,  then 

I(pi :  Pa)  =  log  |r,|  -  log  |r2|  +  tr(r2r,-1  -  /). 

A  somewhat  natural  approach,  then,  is  to  define  the  optimal  representor  for  as  that  pair  (a,  rj) 
whose  corresponding  density  [Equation  (3)]  minimizes  the  discrimination  information  (divergence) 
averaged  over  all  densities  q  corresponding  to  parameter  pairs  in  the  set 


r])k  =  argmin  /  /(<fc:p*)d£,  (17) 

(o,r,)e+  J*k 


where  T>k  is  the  density  corresponding  to  (o’,  r/)k  and  £  is  an  index  to  pairs  (<7,  ij)  in  .  In  words, 
as  representor  for  the  set  select  that  density  (model)  which,  in  an  average  sense,  is  least 
distinguishable  from  the  feasible  densities  (models)  in  'J*.  Note,  no  restriction  is  made  that  the 
point  (<r,  rj)k  must  be  contained  in 

Example.  Consider  the  (trivial)  case  where  it  is  assumed  that  $  =  (0,0.25]  x  (0,20]  and 
K  =  1.  That  K  =  1  is  specified  assumes  all  data  can  be  processed  with  the  weighting  coefficients 
corresponding  to  one  (arbitrary)  set  point  (o,  T7)0.  For  this  case  it  is  an  easy  matter  to  solve  Equa¬ 
tion  (17), 11  and  one  obtains  the  solution  (cr,  77 )0  =  (0.165,  8.4).  Figure  8  shows  the  performance 
of  the  resulting  estimation  algorithm.  Although  performance  is  near  optimal  for  wide  spectrum 
widths  and  high  SNR  levels  (locations  near  the  set  point),  uniform  improvement  for  the  entire  range 
of  parameter  values  in  is  absent,  and  performance  is  severely  compromised  in  places  as  well.  One 
must  conclude  that  this is  too  large  to  be  represented  by  one  set  of  weighting  coefficients. 

Figure  9(a)  is  a  plot  of  the  divergence  for  the  above  example.  Divergence  is  near  zero  in  the 
vicinity  of  the  set  point  and  curves  upward  (away  from  zero)  as  weather  spectrum  width  and  SNR 
deviate  from  the  set  point.  Note,  also,  that  the  curvature  is  not  uniform  in  direction — the  most 
acute  curvature  occurs  with  respect  to  spectrum  width  as  weather  SNR  becomes  large.  Clearly,  the 
goal  is  to  devise  an  adaptive  method  that  has  a  composite  divergence  surface  as  flat  and  as  near 


10Technically,  the  divergence  fails  as  a  true  distance  because  it  does  not  satisfy  the  triangle  inequal¬ 
ity.  This  failing,  however,  does  not  prevent  its  use  in  the  present  application. 

11  Optimization  was  achieved  by  means  of  an  implementation  of  the  Nelder-Mead  modified  polytope 
(direct-search)  algorithm  (see,  for  example,  Gill  et  al.  [8]). 
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Figure  8.  Bayes  algorithm  performance:  suboptimal  a  and  t]  using  best  single-weighting 
coefficient  set.  (b)  tJ  =  13  m/s,  aw  =  0.038  v/vy(J- 
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Figurt  9.  Divergence  minimization  for  three  simple  partitions.  Computed  directed  diver¬ 
gence  between  an  optimal  compartment  representor  (Equation  (17)]  and  the  feasible  model 
densities  of  its  compartment  is  plotted.  (The  divergence  scale  is  arbitrary.) 


zero  as  possible.  The  logical  extension,  of  course,  is  to  seek  improvement  by  solving  the  situation 
for  K  >  1.  Two  simple  extensions,  a  two  and  three  compartment  model,  are  also  illustrated  in 
Figure  9(b-c).  Here,  the  divergence  surface  for  each  compartment  is  computed  as  tv'*  7  ^uation  (17); 
although  not  yet  striking,  the  notion  of  flattening  the  composite  surface  i.,  dear. 

There  is,  however,  a  problem  with  the  simple  extension  of  Equation  (17)  to  K  >  1;  Equa¬ 
tion  (17),  as  written,  has  the  implicit  assumption  that  one  would  (could)  distinguish  perfectly 
between  the  opposing  hypotheses  as  to  whether  data  Z  were  more  Auth  parameters  from 

the  set  tyjt  vs  alternatives  from  its  complement  -  'i/k.  Clearly,  for  K  >  1,  Equation  (17) 

must  be  modified  to  account  for  the  accuracy  of  the  decision  process  that  matches  the  data  to  one 
of  the  subsets  ¥  k: 

(*»*?)*  d=  arg  min  {.Pr(  select  9k  |  #*)  •  f  I(q^:pk)d^ 

(r’.v)e'b  t 

+  Pr(  select  ^k  |  $/tc)  •  <  /(^  :  pk)d£^  .  (18) 

In  this  way,  the  desire  to  optimally  match  (a,  q)k  to  ty*  is  balanced  against  the  probability  that 
the  data  will  be  incorrectly  matched  with  ilk. 

6.2  Suboptimal  Estimation  of  a  and  q 

The  calculations  for  Equation  (18)  require  specification  of  a  set  of  decision  rules  and  the 
corresponding  statistics  (estimators)  to  be  used;  however,  once  this  is  done  all  information  required 
to  solve  Equation  (18)  is  present  and  the  selection  of  T*,  ( k  =  0,  ...,K  -  1),  can  be  completed 
prior  to  the  processing  of  any  data.  Therefore,  although  computationally  formidable,  the  solution  of 
Equation  (18)  is  quite  achievable.  This  section  will  focus  on  decision  rules  using  easily  computable 
suboptimal  estimates  for  o  and  q.  For  suboptimal  estimates,  an  appeal  to  the  assumed  correlation 
structure  of  Equation  (1)  and  (method  of  moments)  estimates  for  a  and  q,  derived  from  a  weighted 
least-squares  fit  to  the  data,  can  be  used.  The  least-squares  equations 


M- 1  M-i  j  A/-1 

^2  wm  In  |fm|  =  wm  M'S  +A'$m]  -  -tr 2ct2  ^  wmm2 


(19a) 


M- 1  A/-1  ,  A/  —  1 

^2  «tmm2ln  |rm|  =  tnmm2ln[.S  +  Ar(5m)  -  -tr2cr2  ^2  wmmA 


(19b) 


can  be  used  to  solve  for  a  and  S  (which  provides  an  estimate  for  q)\  the  lag  estimates  fm  are 
unweighted  here  and  the  (least-squares)  weights  wm  can  be  used  to  weight  or  select  the  lag  estimates 


46 


to  be  used.  If  wm  =  6m-i ,  for  example,  a  becomes  equivalent  to  the  common  single-lag  spectrum 
width  estimator  [see  Zrnic  [9],  his  Equation  (5.1)].  For  the  results  of  this  report, 


def 
tnm  = 


! 


1 

0 


for  m  <  4 
otherwise 


was  used  for  estimation  of  a  and  tj. 
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Figure  10.  A  partition  design  using  simple  decision  rules. 


6.3  Decision  Rules  and  a  Partition  for  $ 

At  this  point  it  is  necessary  to  be  more  specific  regarding  the  description  of  the  V  k's-  For  the 
remainder,  it  is  assumed  that  $  is  the  parallelepiped  of  the  previous  example:  (0,0.25]  x  (0,20].  To 
simplify  definition  of  the  \p*’s  and  to  provide  decisions  based  on  simple  rules,  consider  a  partition 
derived  from  a  sequence  of  threshold  tests — first,  for  f]  and  second,  for  a.  The  general  design  is 
illustrated  in  Figure  10.  A  threshold  test  of  r)  against  (yet  undetermined)  SNR  values  divides  $  into 
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columns  of  (yet  undetermined)  width.  Each  column  i6  further  subdivided  along  spectrum  width 
values  (also  to  be  determined)  and  a  requirement  is  imposed  whereby  increased  partitioning  of  a 
column  with  respect  to  spectrum  width  requires  correspondingly  high  values  for  SNR,  (The  surface 
curvature  of  Figure  9(a)  suggests  that  more  detailed  representation  is  desired  for  high  SNR  values 
than  low.)  To  simplify  implementation,  each  new  column  to  the  right  is  allowed  one  additional 
division  with  respect  to  spectrum  width.  The  placement  of  SNR  and  spectrum  width  thresholds, 
the  selection  of  (a,  r/)k  values,  and  specification  of  K  are  all  unknowns  to  be  determined. 

For  K  fixed,  Equation  (18)  can  be  used  to  identify  optimal  values  for  all  threshold  boundaries 
and  all  (tr,  rj)k  pairs.  A  Jv -compartment  summed  divergence  error  can  be  defined  by  summing  the 
divergence  error  in  Equation  (18)  over  each  set  in  the  partition  of  The  A'- compartment  summed 
divergence  is  monotonic  (nonincreasing)  in  K.  Certainly  adding  extra  compartments  in  Figure  10 
can  only  lower  the  total  divergence  error.  For  example,  optimal  threshold  placement  would  force 
new  compartments  to  become  degenerate  (collapse  to  nothing)  if  they  could  not  improve  the  overall 
error  value;  lower  resolution  compartments  to  the  left  would  get  squeezed  by  higher  resolution 
compartments  from  the  right  if  the  data  and  estimators  for  a  and  T)  could  support  finer  levels  of 
partition.  Hence,  as  K  increases,  the  A'-compartment  summed  divergence  error  (bounded  below  by 
zero)  must  converge.  A  stopping  criteria  can  be  established  for  selecting  K  by  arguing  diminishing 
returns  with  further  increases  in  K.  In  this  way,  A',  optimal  threshold  placement,  and  optimal  set 
point  values  can  be  obtained. 

Figure  11  plots  the  A'-compartment  summed  divergence  for  the  partition  scheme  illustrated 
in  Figure  10.  Evaluation  of  Equation  (18)  was  approximated  by  using  a  Monte  Carlo  simulation 
to  obtain  a  mean  and  standard  deviation  characterization  for  the  b  and  t)  of  Equation  (19),  and 
a  Gaussian  approximation  was  used  to  evaluate  Pr{  select  'P/t|'P*)  and  complement.  Based  on  the 
results  presented  in  Figure  11,  K  =  15  was  selected  for  continued  analysis.  The  corresponding 
thresholds  and  set  point  values  for  K  =  15  are  summarized  in  Table  2  and  illustrated  in  Figure  12. 

In  Figure  12,  optimal  set  point  locations  are  indicated  by  a  labeled  (square)  dot.  The  convex 
hull  of  the  set  points  in  ^  is  illustrated  by  shading.  The  most  striking  result  of  the  optimization  is 
that  the  best  set  point  for  a  region  'P*  is  not  necessarily  contained  within  <?*.  The  convex  hull  of  the 
set  points  represents  the  constraint  required  of  a  and  f]  to  balance  the  effect  (on  velocity  estimation) 
of  their  uncertainty.  Figure  13  plots  the  sequence  of  convex  hulls  for  the  values  of  K  considered 
in  Figure  11.  As  K  increases,  the  convex  hull  expands  but  there  is  a  fundamental  limitation  to  its 
eventual  extent.  To  increase  the  extent  of  the  limiting  hull  requires  more  precise  estimators  for  a 
and  r]  or,  equivalently,  more  data.  The  limiting  hull  will  always  be  strictly  contained  within  <P:  for 
the  hull  to  be  equivalent  to  (i.e.,  cover)  *P  implies  that  perfect  estimation  of  a  and  rj  is  possible. 

6.4  Adaptive  Estimator  Performance 

The  adaptive  procedure  defined  by  Table  2  (Figure  12)  was  used  to  process  simulated  data  as 
in  the  previous  sections.  The  results  for  narrow  and  wide  spectrum  width  weather  are  presented 
in  Figure  14. 
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For  narrow  spectrum  width  weather,  Panels  a  and  b,  the  adaptive  method  demonstrates  a  near 
uniform  improvement  on  the  performance  of  PP.  The  performance  is  not  only  better  than  that  of 
PP,  it  is  much  closer  to  the  optimum  bound  established  in  earlier  sections.  The  slight  deterioration 
at  the  higher  SNR  values  may  be  due,  in  part,  to  the  decision  to  use  the  15  compartment  partition. 
Figure  13  indicates  that  the  21  compartment  partition  added  refinement  specific  to  higher  SNR 
values;  therefore,  some  further  improvement  at  higher  SNR  values  may  be  possible.  Note,  also, 
that  in  parallel  with  approaching  the  standard  error  bound  for  velocity  estimation,  the  adaptive 
estimator  also  approaches  the  bias  performance  of  the  optimal  Bayes  (bound)  estimator. 

For  wide  spectrum  width  weather,  Panels  c  and  d,  the  room  for  improvement  was  not  as 
evident  (except  at  low  and  very  high  SNR  values).  Nevertheless,  the  adaptive  method  demon¬ 
strated  performance  improvement  relative  to  PP  at  low  and  high  SNR  values — those  regions  where 
improvement  relative  to  optimal  performance  was  most  likely. 
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TABLE  2 


Optimal  Thresholds  and  Set  Point  Values  ( K  =  15) 


Compartment 

Set  Point 

Upper  T) 
Threshold 

Upper  a 
Threshold 

0 

(0.0983,  2.386) 

1.367 

- 

1 

(0.1095,  3.892) 

3.545 

0.0909 

2 

(0.1659,  3.794) 

- 

3 

(0.1041.  6.440) 

7.047 

0.0549 

4 

(0.1452,  7.082) 

0.1279 

5 

(0.1921,  6.960) 

- 

6 

(0.0907,  9.554) 

12.006 

0.0380 

7 

(0.1317,  9.723) 

0.0903 

8 

(0.1667,  10.203) 

0.1476 

9 

(0.2039,  10.916) 

- 

10 

(0.0811,  13.093) 

0.0354 

11 

(0.1186,  13.822) 

0.0821 

12 

(0.1554,  13.927) 

0.1252 

13 

(0.1848,  15.046) 

0.1844 

14 

(0.2155,  15.438) 

- 

The  data  of  the  previous  figure  was  combined  with  measurements  at  additional  weather 
spectrum  widths  to  produce  Figure  15,  which  illustrates  estimator  performance  as  a  function 
of  weather  spectrum  width  for  fixed  levels  of  SNR.  There  is  an  almost  uniform  improvement  in 
performance  relative  to  PP  and  this  improvement  for  a  fixed  SNR  level  is  seen  to  apply  across  the 
range  of  spectrum  widths  considered.  The  improvement  is  most  striking  for  the  two  low  SNR  levels 
(Panels  a  and  b).  Improvement  does  not  require  a  corresponding  performance  loss  at  higher  SNR 
levels — the  adaptive  method  continues  to  improve  performance  there  as  well. 

6.5  Summary 

Applying  adaptive  procedures  to  situations  with  small-sample  sizes  is  a  difficult  task.  This 
section  has  shown  that  by  adopting  suitable  constraints,  an  adaptive  method  could  be  developed 
for  the  small  sample  velocity  estimation  problem. 

Performance  for  the  adaptive  (Bayes)  estimator  is  close  to  the  optimal  performance  bounds 
established  earlier.  Significant  improvement  relative  to  PP  was  established. 
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Figure  ]/.  Bayes  algorithm  with  adaptive  a  and  rj:  15  compartment  weighting  coefficient 
set  (I).  (a)  v  =  5  m/s,  crw  =  0.038 
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Figure  H.  Bayes  algorithm  with  adaptive  <j  and  r):  15  compartment  weighting  coefficient 
set  (I).  (b)  v  =  13  m/s,  <rw  =  0.038  vNy9. 
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Figure  14-  Bayes  algorithm  wtlh  adaptive  a  and  r?.  15  compartment  weighting  coefficient 
set  (I).  (c)  7=5  m/s,  aw  =  0.192  vNyq. 
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Figure  H .  Bayes  algorithm  with  adaptive  a  and  tj:  15  compartment  weighting  coefficient 
set  (I).  (d)  v  =  13  m/s,  cw  =  0.192 
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7.  CONCLUSIONS 


Each  of  the  previous  sections  included  a  brief  summary  and  these  statements  will  not  be 
repeated  here.  Some  concluding  remarks  are,  nevertheless,  in  order. 

This  report’s  primary  objective  has  been  to  examine  the  challenge  of  Doppler  velocity  esti¬ 
mation  when  confronted  with  a  small-sample  size.  Applying  a  generally  accepted  model  for  the 
measurement  of  (meteorologically  generated)  Doppler  signals,  lower  bound  performance  limitations 
were  first  established.  These  bounds  were  shown  to  be  clearly  different  (greater)  than  those  pro¬ 
vided  by  standard  CR  analysis,  but  room  for  improvement  (relative  to  the  standard  PP  estimator) 
was  nevertheless  indicated.  Optimal  velocity  estimation,  under  the  assumed  model,  by  definition 
requires  the  joint  estimation  of  a  vector  parameter.  Previous  attempts  at  such  optimal  estimation 
have  been  hampered  by  the  technical  difficulties  that  arise  in  solving  the  complicated  system  of 
equations  that  result.  The  potential  of  approximate  methods  that  seek  to  treat  some  of  the  vector 
parameters  as  fixed  quantities  was  explored.  Sensitivity  to  these  approximating  assumptions  was 
studied  and  insight  into  the  relative  importance  of  these  nuisance  parameters  was  provided.  An 
adaptive  method  was  proposed.  The  new  method  does  not  require  excessive  computations — nothing 
more  extensive  than  previously  proposed  FFT  methods.  The  final  adaptive  estimation  scheme  was 
shown  to  provide  near  optimal  performance  for  a  span  of  SNR  and  spectrum  widths  likely  to  be 
associated  with  meteorological  signals. 
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APPENDIX  A 
Proof  of  Equation  (4). 


Proposition  A.l  Let  ZT  =  [ z0zi  be  a  complex  random  vector  with  Gaussian  density 

p(Z|0)  = 

where  R  =  R(O)  is  the  M  X  M  covariance  matrix  and  QT  =  [#o  #1  •  •  •  #n-i]  *s  a  real-valued 
parameter  vector.  Then,  the  Fisher  Information  matrix  F  =  defined  by 

„  „  f/£>lnp(Zl©)\r  /Slnp(Z!©)>' 

F  =  £  {  a©  )  {  a@—) 


can  be  obtained  equivalently  from 


fid  =  tr 


dR 

dOi 


R 


-l 


(A.l) 


Proof:  Define  the  unit  vectors  of  order  n  as 


'  1  ’ 

o 

f - 

o 

0 

,  e2  = 

1 

,  .  .  .  ,  €n 

...  o 

O 

i 

o 

[.J 

and  the  elementary  matrix  Et<}  (of  order  nxnjas 
=  e.ej. 


For  an  arbitrary  matrix  A  =  [a(iJ],  the  following  identities  are  standard  (see  Graham  [10],  for 
example): 


atiJ  =  e\  Ae,, 


(A.2a) 


Cl 


(A.2b) 


ee  a*j  a'->  e,€J  - 

•  j  «  j 


and 


trA  =  efAe*. 

I 


(A.2c) 


To  prove  Equation  (A.l),  begin  by  taking  the  logarithm  of  the  density, 
In p(Z|0)  =  —M  In  Jr  —  In  |R|  -  Z^R^Z, 


and  compute  the  derivative,  term  by  term,  with  respect  to  0.  Only  the  latter  two  terms  are  of 
importance.  For  the  first  of  these,  dln|R|/30, 


ain  |R| 
ddk 


=  EE 

«  j 

-  EE 

«  j 

-  EE 


ain|R|8rjj 

dritj  dOk 


1  d\R\drtJ 

|R|  drifj  aek 
J  • 

r*  aek  rince 


d\R\ 

dR 


tr 


5R'I 

dekr 


(A.3) 


where  the  last  line  follows  from  application  of  Equation  (A. 2): 


dr,t  j 

aek ' 


For  the  second  term,  note  that 

Z^R_1Z  =  tr{Z^R-‘Z}  =  tr{ZZ^R-1} 


from  which  it  follows  that 

dZ^  R-1Z  atr{ZZtR"'} 

ao  ~  ae 
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Continuing, 


atrlZZtR"1}  _  ^  „  dtrJZZtR-1}  9r,tj 

-  dTt  ]  d6k 


dBk 


*  J 


f^ZZ^R-'l  drtiJ 

-  dr,4  }  dh 


where  the  last  line  follows  from  the  result 


dR 


-l 


=  -R-'^jR*1 


Additional  rearrangement  results  in 


Combining  Equations  (A. 3)  and  (A.4)  provides  the  intermediate  result 


(A-4) 


91np(Z|0) 

dBk 


=  "{R"  f^B-",<zzt -RJ}- 


(A.5) 


The  next  step  is  to  evaluate  the  expectation.  From  the  definition  of  the  Fisher  Information  matrix 
and  Equation  (A.5), 


Fij  =  E 


91np(Z|©)  9 In p(Z|0) 


OB,  dB, 

=  E  tr{R-1|5iR-I(ZZt-R)}tr|R-,^R-,(Zzt-R) 
Working  with  the  terms  inside  the  expectation  and  again  using  (A. 2): 

tr  |r_1  R-1  (ZZ^  -  R)|  trjR-1  |^R-'(Zzt  -R)| 
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=  >'  (R"  W, R'' (ZZ'  - R)} "  {<zzt  - R)R"  w,  R“  } 

=  R_1  (zzt  -  R)e* ^(zzt  -  R) R-1  R-1< 


Now, 


(Zzt  -R)c*cf(Zzt  -  R) 


is  an  M  x  M  matrix  with  nm^  entry  given  by 


(znzk  ~  rn,k)(zlzm  ~  rl,m) 


for  which  the  expectation  is  commonly  known  (see  Miller  [11],  for  example): 


E{{zn zk  -  rnik)(zlzm  -  rl.m )}  = 


Therefore, 


£{(ZZt-R)e*er(ZZt-R)}  =  Rr,ifc. 


Finally,  using  Equations  (A.6)  and  (A.7)  one  obtains 


E  {E  f£R_‘  <ZZ'  -  R)e»  efizri  -  R)H-'  R->e,  j 


■  EE^'f  R-f  R-^« 


=  tr 


which  is  the  desired  result. 
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